Parallel Parsing for Unification Grammars
Author
Abstract
The parsing problem for arbitrary unification grammars is unsolvable. We present a class of unification grammars for which the parsing problem is solvable, and a parallel parsing algorithm for this class of grammars.
1. Introduction
Unification grammars have the power of a Turing machine, and one can easily prove this by showing that a unification grammar can simulate any Prolog program. It follows that the problem of finding all possible parses of a sentence in a given unification grammar is unsolvable. The best we can do is an algorithm that sometimes finds a set of parses and sometimes goes into an infinite loop. The top-down, left-to-right parser used with definite clause grammars is of this kind: if the grammar contains left recursion, the parser may run forever.
If we want to write a parallel parsing algorithm for unification grammar, we first need to find a subset of unification grammar for which the parsing problem is solvable. Indeed, this is the hard part of the problem. We shall see that once we have a parsing algorithm, finding the parallelism is straightforward.
Part 1 reviews the algorithm of Cocke, Kasami, and Younger for parsing context-free grammars in Chomsky normal form. This algorithm is beautifully simple and easily extends to a parsing algorithm for unification grammars in Chomsky normal form. Therefore the parsing problem is solvable for unification grammars in Chomsky normal form. Unfortunately, this subset of unification grammar is too restricted to describe human language. Part 2 therefore considers an extension of Chomsky normal form which allows chain rules: rules having one nonterminal symbol on the right side. We generalize the CKY algorithm to handle context-free grammars with chain rules, and extend this algorithm to unification grammars.
The extension works only if one places a restriction on the use of chain rules in a unification grammar, and that restriction is one main point of the paper. Once we have the parsing algorithm for unification grammars with chain rules, we can easily extend it to unification grammars with any number of symbols on the right side of a rule. Finally, we consider the possibilities of parallelism in the new parsing algorithm.
2. Parsing in Chomsky Normal Form
A context-free grammar in Chomsky normal form contains two kinds of rules. Terminal rules have a single terminal on the right side; branching rules have exactly two nonterminal symbols on the right side. Since no rule has an empty right side, no symbol can generate the empty string. We use the capital letters A, B, C as variables ranging over nonterminal symbols. To describe the substrings of an input sentence we number the spaces between words: 0 is the space before the first word and n is the space after the nth word. If i < j, input[i, j] is the string of words between space i and space j. The CKY algorithm builds a matrix M such that $M[i, j] = \{ A \mid A \Rightarrow^{*} \mathrm{input}[i, j] \}$. Strictly speaking this is a recognizer, not a parser, but it is easily extended to a parser, and the same is true for the other algorithms in this paper. If S1 and S2 are sets of nonterminal symbols, define the product of S1 and S2, S1 * S2, by
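The excerpt breaks off before the product is defined. As a minimal sketch, assuming the usual CKY product (S1 * S2 contains every A for which some branching rule A -> B C has B in S1 and C in S2), the matrix M for a context-free grammar in Chomsky normal form could be built as follows. The function names and the toy grammar are hypothetical and are not taken from the paper.

```python
def set_product(s1, s2, branching):
    """Assumed product S1 * S2: all A with a rule A -> B C, B in S1, C in S2."""
    return {a for (a, b, c) in branching if b in s1 and c in s2}


def cky_matrix(words, terminal, branching):
    """Build M with M[i, j] = {A | A derives words[i:j]} for 0 <= i < j <= n."""
    n = len(words)
    m = {}
    # Spans of length 1 are filled from the terminal rules A -> word.
    for i, word in enumerate(words):
        m[i, i + 1] = {a for (a, w) in terminal if w == word}
    # A longer span [i, j] combines two shorter spans at each split point k.
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length
            cell = set()
            for k in range(i + 1, j):
                cell |= set_product(m[i, k], m[k, j], branching)
            m[i, j] = cell
    return m


# Hypothetical toy grammar for illustration only.
TERMINAL = {("NP", "she"), ("V", "eats"), ("NP", "fish"),
            ("Det", "the"), ("N", "fish")}
BRANCHING = {("S", "NP", "VP"), ("VP", "V", "NP"), ("NP", "Det", "N")}


def recognizes(words, start="S"):
    """Return True if the start symbol derives the whole input."""
    if not words:
        return False  # no symbol generates the empty string
    return start in cky_matrix(words, TERMINAL, BRANCHING)[0, len(words)]
```

For example, `recognizes("she eats the fish".split())` checks whether S is in M[0, 4]. In this sketch the entries for spans of equal length do not depend on one another, which is one natural place to look for parallelism; the excerpt does not say whether the paper exploits it in this way.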
Similar resources
Deterministic Shift-Reduce Parsing for Unification-Based Grammars by Using Default Unification
Many parsing techniques, including parameter estimation, assume the use of a packed parse forest for efficient and accurate parsing. However, they have several inherent problems deriving from the restriction of locality in the packed parse forest. Deterministic parsing is one of the solutions that can achieve simple and fast parsing without the mechanisms of the packed parse forest by accurately choo...
Guaranteeing Parsing Termination of Unification Grammars
Unification grammars are known to be Turing-equivalent; given a grammar G and a word w, it is undecidable whether w ∈ L(G). In order to ensure decidability, several constraints on grammars, commonly known as off-line parsability (OLP), were suggested. The recognition problem is decidable for grammars which satisfy OLP. An open question is whether it is decidable if a given grammar satisfies OLP. In this pap...
Parsing Algorithms for Grammars with Regulated Rewriting
In recent papers [4, 5, 8, 11] Petri net controlled grammars have been introduced and investigated. It was shown that various regulated grammars such as random context, matrix, vector, valence grammars, etc., resulted from enriching context-free grammars with additional mechanisms can be unified into the Petri net formalism, i.e., a grammar and its control can be represented by a Petri net. Thi...
A Parsing Algorithm for Unification Grammar
We describe a table-driven parser for unification grammar that combines bottom-up construction of phrases with top-down filtering. This algorithm works on a class of grammars called depth-bounded grammars, and it is guaranteed to halt for any input string. Unlike many unification parsers, our algorithm works directly on a unification grammar--it does not require that we divide the grammar into ...
A Generalization of the Offline Parsable Grammars
The offline parsable grammars apparently have enough formal power to describe human language, yet the parsing problem for these grammars is solvable. Unfortunately they exclude grammars that use x-bar theory and these grammars have strong linguistic justification. We define a more general class of unification grammars, which admits x-bar grammars while preserving the desirable properties of off...
Reversible Unification Based Machine Translation
In this paper it will be shown how unification grammars can be used to build a reversible machine translation system. Unification grammars are often used to define the relation between strings and meaning representations in a declarative way. Such grammars are sometimes used in a bidirectional way, thus the same grammar is used for both parsing and generation. In this paper I will show ...